Abstract

We discuss five simple functions on finite multisets of metric spaces.
The first four are all metrics iff the underlying space is bounded and are
complete metrics iff it is also complete. Two of them, and the fifth
function, all generalise the usual Hausdorff metric on subsets. Some possible
applications are also considered.

1 Introduction 

Metrics on subsets and multisets (subsets-with-repetition-allowed) of metric 
spaces have or could have numerous fields of application such as credit rating, 
pattern or image recognition and synthetic biology. We employ three related 
models (called E, F and G) for the space of multisets on the metric space (X, d).
On each of E, F we define two closely-related functions. These four functions 
all turn out to be metrics precisely when d is bounded, and are complete iff d 
is also complete. Another function studied in model G has the same properties 
for (at least) uniformly discrete d. X is likely to be finite in many applications 
anyway. 

We show that there is an integer programming algorithm for those in model 
E. The three in models F and G generalise the Hausdorff metric. Beyond 
finiteness, no assumptions about multiset sizes are required. 

Various types of multiset metric on sets have been described (Section 1.3), but the
few that incorporate an underlying metric only refer to R or C. The simple and 
more general nature of those described here suggests that there may be other 
interesting possibilities. 

In this section, after setting out notation and required background, we mention
briefly the existing work in this field. The following three sections are each
dedicated to one of E, F and G. 

1.1 Notation: metric spaces 

R is the non-negative reals, N includes 0, and (X, d) is a metric space of more
than one element. d is uniformly discrete if $\exists a > 0$ such that $d(x, z) > a$
whenever $x \neq z$, and two metrics on X are equivalent if they induce the same
topology.

d is complete iff every Cauchy sequence converges (to a point of X), and d 
is compact iff every sequence, Cauchy or not, has a subsequence that converges 
to a point of X. 

*MSC primary 51F99,03E70; secondary 54E50,62H30,68T10,91B12,92C42. 
The well-known Hausdorff metric $d_H$ on the space H of all non-empty compact
subsets of X is defined for $A, B \in H$ by

$$d_H(A,B) = \max\Big(\max_{x \in A}\min_{y \in B} d(x,y),\ \max_{y \in B}\min_{x \in A} d(x,y)\Big)$$

in which compactness guarantees that all these extrema are attained. We
use later the simple fact that if d does not satisfy the triangle inequality, neither
does $d_H$. It is a standard fact [Edg90, pp. 71-72] that $d_H$ is complete if d is. The
converse is also true.^1 A convenient heuristic (for finite A, B) is to label the rows
(the columns) of a matrix by the elements of A (of B), with the corresponding
d-distances as entries. Then $d_H(A,B)$ is the largest of all row and column
minima.
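
The row-and-column heuristic can be sketched in a few lines of Python. This is an illustrative sketch only: the function names `hausdorff` and `d_line`, and the choice of the real line as underlying space, are assumptions made for the example and are not part of the paper.

```python
def hausdorff(A, B, d):
    """Hausdorff distance between finite non-empty collections A and B.

    Rows are indexed by A, columns by B, entries are d-distances, and the
    result is the largest of all row and column minima.
    """
    A, B = list(A), list(B)
    if not A or not B:
        raise ValueError("both sets must be non-empty")
    matrix = [[d(x, y) for y in B] for x in A]
    row_minima = [min(row) for row in matrix]
    col_minima = [min(col) for col in zip(*matrix)]
    return max(row_minima + col_minima)


def d_line(x, y):
    # Illustrative underlying metric (an assumption): absolute difference.
    return abs(x - y)


print(hausdorff({0, 1}, {0, 3}, d_line))
```

For A = {0, 1} and B = {0, 3} the row minima are 0 and 1 and the column minima are 0 and 2, so the distance is 2.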

Given an equivalence relation $\sim$ on X, and $\alpha, \beta \in X/\sim$, write

$$D(\alpha,\beta) = \inf \big[ d(a,p_1) + d(q_1,p_2) + d(q_2,p_3) + \cdots + d(q_{n-1},p_n) + d(q_n,b) \big]$$

where $n \in N$, $a \in \alpha$, $b \in \beta$, and $p_i \sim q_i$ for each i. In general D is a
pseudometric on $X/\sim$, that is $D(\alpha,\beta) = 0 \not\Rightarrow \alpha = \beta$, though D does satisfy the
other metric axioms. Clearly $D(\alpha,\beta) \le \inf_{a \in \alpha, b \in \beta} d(a,b)$. To simplify notation,
we adopt the conventions that $a = q_0$, $b = p_{n+1}$ and $p_i \not\sim p_{i+1}$ for any i.

1.2 Notation: multisets 

A recent survey article on multisets and their applications is [SIYS07]. The
notation and terminology in this article mostly follow [DD09] and [Pet97]. A
convenient definition of multiset also introduces the model E (Section 2).

A multiset of a set S is a function $e : S \to N$ taking each $s \in S$ to its
multiplicity e(s). The root set R(e) of e is $\{s \in S : e(s) > 0\}$, always assumed
finite. The cardinality^2 of e is $C(e) = \sum_{s \in S} e(s)$. E is the set of functions
of finite support from S to N.

We denote by $e_s$, for $s \in S$, the multiset consisting of a single copy of s and
define $e_0$ by $R(e_0) = \emptyset$. Naturally any multiset has a unique form $\sum_{s \in S} e(s) e_s$;
we can add or subtract them if all the arithmetic is within N.

E forms a lattice under the operations $\cap$ and $\cup$ defined for $e, f \in E$ by
$(e \cap f)(s) = \min(e(s), f(s))$ and $(e \cup f)(s) = \max(e(s), f(s))$. The multiset difference
$e_f$ is $e - e \cap f$, and e and f are disjoint if $e \cap f = e_0$. For instance $e_f$ and $f_e$ are
disjoint. The symmetric difference of e and f is $e \Delta f = e_f + f_e = e \cup f - e \cap f$.
e is a submultiset of f, written $e \subseteq f$, if $e(s) \le f(s)\ \forall s$ and of course this is
equivalent to $e \cap f = e$ or $e_f = e_0$.
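
These lattice operations map directly onto Python's `collections.Counter`, which is used here purely as an illustration of model E. The element names "x", "y", "z" and the helper names `cardinality` and `is_submultiset` are assumptions for the sketch.

```python
from collections import Counter

e = Counter({"x": 3, "y": 1})           # e = 3e_x + e_y
f = Counter({"x": 1, "y": 2, "z": 1})   # f = e_x + 2e_y + e_z


def cardinality(m):
    """C(m): the sum of all multiplicities."""
    return sum(m.values())


def is_submultiset(m, n):
    """True when m(s) <= n(s) for every s."""
    return all(n[s] >= k for s, k in m.items())


meet = e & f            # pointwise minimum, e ∩ f
join = e | f            # pointwise maximum, e ∪ f
e_f = e - f             # multiset difference; Counter drops non-positive counts
f_e = f - e
sym_diff = e_f + f_e    # symmetric difference

print(meet, join, e_f, f_e, sym_diff, cardinality(e))
```

With these values the two expressions for the symmetric difference agree (`sym_diff == join - meet`), and `e_f & f_e` is the empty multiset, matching the statement that the two differences are disjoint.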

A function h from e to f is simply a function h from R(e) to R(f), to
guarantee that identical elements of e are not mapped to distinct elements of f.
We say that h is an injection (resp. surjection, bijection), according as (i) its
restriction to the root sets has this property in the ordinary sense, and (ii) for
every $s \in R(e)$, $e(s) \le f(h(s))$ (resp. $e(s) \ge f(h(s))$, both of the preceding).

^1 Let $x_i$ be a non-convergent Cauchy sequence in X so that $S_i = \{x_i\}$ is Cauchy in H with
putative limit $S \in H$, so S is non-empty. If $S = \{x\}$ then $d(x_i, x) \to 0$. Thus S contains
distinct $a, b \in X$. But then $d_H(S_i, S) \ge \max(d(x_i, a), d(x_i, b)) \ge \tfrac{1}{2} d(a, b) > 0$.

^2 Called the counting measure in [DD09].
1.3 Other metrics on multisets



We give a short account of the multiset metrics listed at [DD09, pp. 51-52],
described elsewhere in that book, and regrouped here according to the main
idea.

• The matching distance [DD09, p. 47] is defined by $\inf_g \max_{x \in e} d(x, g(x))$
where g runs over all (multiset) bijections from e to f. These are used in
size theory (image recognition), where a geometric trick is used to ensure
that bijections are always defined. A survey article is [dFL06].

• The metric space of roots [DD09, p. 221] is defined on multisets of C of
fixed cardinality n, each identified with the monic polynomial of which it
is the set of roots. Two such $u_1, \ldots, u_n$ and $v_1, \ldots, v_n$ are separated by
$\min_\rho \max_{1 \le j \le n} |u_j - v_{\rho(j)}|$ as $\rho$ ranges over the permutations of 1, ..., n.
More details are in [CM06].

• Petrovsky has defined several metrics [DD09, p. 52] on E using a measure
$\mu_\lambda : E \to R$, $\mu(e) = \sum_{s \in S} \lambda(s) e(s)$ where $\lambda : S \to R_+$. Thus $\mu = C$
when $\lambda = 1$. One of them is $d(e, f) = \mu(e \Delta f) = \mu(e_f) + \mu(f_e)$ and the
others are variants [Pet97, Pet03]. They are related to the Jaccard and
Hamming metrics on sets [DD09, p. 299, p. 45], and seem to be primarily
used in cluster analysis (decision making).

• The $\mu$-metric [DD09, p. 281] on so-called phylogenetic X-trees (computational
biology), again is based on symmetric difference. See [CRV09] for
more details.

• The bag distance [DD09, p. 204], used in string matching, is defined to be
$\max(C(e_f), C(f_e))$.

• In approximate string matching (for instance in bioinformatics), so-called
q-gram similarity [DD09, p. 206] is defined. This is not a metric.
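
Two of the distances in this list depend only on the multiset differences and are easy to sketch. The following is a minimal illustration, assuming multisets are stored as `collections.Counter` objects; the function names and the sample strings are hypothetical, and the weight function plays the role of $\lambda$ in the symmetric-difference distance.

```python
from collections import Counter


def weighted_size(m, weight=lambda s: 1):
    """mu(m) = sum of weight(s) * m(s); with weight 1 this is the cardinality."""
    return sum(weight(s) * k for s, k in m.items())


def symmetric_difference_distance(e, f, weight=lambda s: 1):
    """mu(e_f) + mu(f_e), the symmetric-difference distance described above."""
    return weighted_size(e - f, weight) + weighted_size(f - e, weight)


def bag_distance(e, f):
    """max(C(e_f), C(f_e))."""
    return max(weighted_size(e - f), weighted_size(f - e))


e = Counter("aabc")   # 2e_a + e_b + e_c
f = Counter("abbd")   # e_a + 2e_b + e_d
print(symmetric_difference_distance(e, f), bag_distance(e, f))
```

Here $e_f = e_a + e_c$ and $f_e = e_b + e_d$, so the symmetric-difference distance is 4 and the bag distance is 2. Neither value uses any metric on the underlying set.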

Note that there are two dominant ideas: minimising over multiset bijections,
and symmetric differences. The latter do not reflect any structure on S except
perhaps if we argue that multiplicity may depend on that structure. To some
extent, the metrics described later mix these two paradigms.

There are a number of other standard possibilities, such as the metric induced
on E by any injection into a metric space, or those given by taking the
sum (or the supremum) of the $|e(s) - f(s)|$ where $e, f \in E$ and $s \in S$. Any
metric on the positive integers (multisets on the prime numbers) is also an example.

2 The multiset model E 

If $a, c \in E$ and $C(a) \le C(c)$, we find a submultiset c' of c of cardinality C(a)
so that, matching elements of a and c' as described below, the sum of the
d-distances is minimised, and then we add a constant. The result, denoted $d_E$,
though resembling the matching distance just described, actually generalises the
bag distance. The other function, $d_{Em}$ (m for 'mean') is obtained by dividing
$d_E$ by C(c).
We choose M > 0, and define $\theta = \sup d / M$ when d is bounded. Given $a, c \in E$,
suppose that $C(a) \le C(c)$ and $c \neq e_0$. Write down all the elements in both in
arbitrary order, viz., $a_1, a_2, \ldots, a_{C(a)}$ and $c_1, c_2, \ldots, c_{C(c)}$ where for each $x \in X$,
$\#\{i : a_i = x\} = a(x)$ and $\#\{j : c_j = x\} = c(x)$. (In other terminology, we
parametrise the multisets by enough positive integers.)

Let $\gamma$ be a member of the permutation group $G_c$ on C(c) elements, acting
on the subscripts in the c-sequence. Write

$$d^\gamma(a,c) = \sum_{j=1}^{C(a)} d(a_j, c_{\gamma(j)}) + M|C(c) - C(a)|$$

and define the following functions $d_E$ and $d_{Em}$ from E × E to R.

$$d_E(a,c) = \min_{\gamma \in G_c} d^\gamma(a,c) \quad\text{and}\quad d_{Em}(a,c) = \frac{d_E(a,c)}{\max(C(a), C(c))}$$

with $d_{Em}(e_0, e_0) = 0$. We call $M|C(c) - C(a)|$ the notional part of $d_E(a,c)$.
The mappings $\gamma$, regarded as from a to c, need not be multiset functions.
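
The definition can be checked on small examples by brute force. The sketch below enumerates every choice of images $c_{\gamma(1)}, \ldots, c_{\gamma(C(a))}$, so it is exponential in the cardinalities and meant only as an illustration. The names `d_E`, `d_Em` and `d_capped`, and the capped metric on the real line, are assumptions of the example rather than anything fixed by the text.

```python
from collections import Counter
from itertools import permutations


def d_E(a, c, d, M):
    """Brute-force d_E: minimise the matched d-distances, then add the notional part."""
    A, C = list(a.elements()), list(c.elements())
    if len(A) > len(C):          # the definition assumes C(a) <= C(c)
        A, C = C, A
    # permutations(C, len(A)) lists every possible (c_gamma(1), ..., c_gamma(C(a))).
    matched = min(
        sum(d(x, y) for x, y in zip(A, image))
        for image in permutations(C, len(A))
    )
    return matched + M * (len(C) - len(A))


def d_Em(a, c, d, M):
    """d_E divided by the larger cardinality, with d_Em(e_0, e_0) = 0."""
    size = max(sum(a.values()), sum(c.values()))
    return 0 if size == 0 else d_E(a, c, d, M) / size


def d_capped(x, y):
    # Hypothetical bounded metric on the real line with sup d = 2.
    return min(abs(x - y), 2)


a = Counter({0: 2})          # 2e_0
c = Counter({0: 1, 1: 2})    # e_0 + 2e_1
print(d_E(a, c, d_capped, M=1), d_Em(a, c, d_capped, M=1))
```

In this example one copy of 0 is matched to 0 and the other to 1, giving a matched sum of 1 and a notional part of 1, so $d_E(a,c) = 2$ and $d_{Em}(a,c) = 2/3$.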

Proposition 1. If d is unbounded then $d_E$ and $d_{Em}$ are non-metrics for all M.
If d is bounded, then $d_E$ is a metric iff $\theta \le 2$, and $d_{Em}$ is a metric iff $\theta \le 1$.

Proof. Only the triangle inequality need be verified or could fail. Let $x, y \in X$,
with $x \neq y$. Then

$$d_E(e_x, e_x + e_y) + d_E(e_x + e_y, e_y) - d_E(e_x, e_y) = 2M - d(x,y)$$

So if $d_E$ is a metric, d is bounded and $\theta \le 2$. The same argument for $d_{Em}$
implies $\theta \le 1$.

From now on we take $a, b, c \in E$, and assume $C(a) \le C(c)$. We look first at
$d_E$ and suppose $\theta \le 2$: as motivation, we could verify that whenever

$$2C(b) \le C(a)(2 - \theta) \quad\text{or}\quad 2C(b) \ge \theta C(a) + 2C(c)$$

then the notional parts alone in $d_E(a,b) + d_E(b,c)$ add to at least

$$M(C(c) - C(a)) + \theta M C(a) \ge d_E(a,c)$$

The value of C(b) determines three cases, all with similar reasoning.
Case $C(b) < C(a)$: there exist $\alpha \in G_a$ and $\gamma \in G_c$ such that

$$d_E(a,b) = \sum_{i=1}^{C(b)} d(a_{\alpha(i)}, b_i) + M(C(a) - C(b))$$

and

$$d_E(b,c) = \sum_{i=1}^{C(b)} d(b_i, c_{\gamma(i)}) + M(C(c) - C(b))$$

So

$$d_E(a,b) + d_E(b,c) \ge \sum_{i=1}^{C(b)} d(a_{\alpha(i)}, c_{\gamma(i)}) + M(C(c) - C(a)) + 2M(C(a) - C(b))$$

$$\ge \sum_{i=1}^{C(a)} d(a_{\alpha(i)}, c_{\gamma(i)}) + M(C(c) - C(a)) + (2 - \theta)M(C(a) - C(b)) \qquad (1)$$

having added, for each i beyond C(b), the non-positive $d(a_{\alpha(i)}, c_{\gamma(i)}) - \theta M$.
Then (1) is at least

$$d_E(a,c) + M(2 - \theta)(C(a) - C(b)) \ge d_E(a,c)$$
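Case $C(a) \le C(b) \le C(c)$: there exist $\beta \in G_b$ and $\gamma \in G_c$ such that

$$d_E(a,b) = \sum_{i=1}^{C(a)} d(a_i, b_{\beta(i)}) + M(C(b) - C(a))$$

and

$$d_E(b,c) = \sum_{i=1}^{C(b)} d(b_i, c_{\gamma(i)}) + M(C(c) - C(b))$$

Then

$$d_E(a,b) + d_E(b,c) = M(C(c) - C(a)) + \sum_{i=1}^{C(a)} d(a_i, b_{\beta(i)}) + \sum_{i=1}^{C(b)} d(b_i, c_{\gamma(i)}) \qquad (2)$$

Now $\sum_{i=1}^{C(b)} d(b_i, c_{\gamma(i)}) = \sum_{i=1}^{C(b)} d(b_{\beta(i)}, c_{\gamma\beta(i)})$ since $\beta \in G_b$, and so (2) is at
least

$$M(C(c) - C(a)) + \sum_{i=1}^{C(a)} d(a_i, c_{\gamma\beta(i)}) + \sum_{i=1+C(a)}^{C(b)} d(b_{\beta(i)}, c_{\gamma\beta(i)})$$

which is at least $d_E(a,c)$, in this case for any $\theta$.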
Case $C(b) > C(c)$: For some $\tau \in G_c$,

$$d_E(a,c) = M(C(c) - C(a)) + \sum_{i=1}^{C(a)} d(a_i, c_{\tau(i)})$$

and $\rho, \sigma \in G_b$ are given by

$$d_E(a,b) = \sum_{i=1}^{C(a)} d(a_i, b_{\rho(i)}) + M(C(b) - C(a))$$

and

$$d_E(b,c) = \sum_{i=1}^{C(c)} d(b_{\sigma(i)}, c_i) + M(C(b) - C(c))$$

We write $\omega = \sigma^{-1}\rho \in G_b$, which takes any subscript of a to a subscript of c,
and define

$$l = \#\{1 \le i \le C(a) : \rho(i) = \sigma(j) \text{ for some } j \text{ in } 1, \ldots, C(c)\}$$

Then $l \ge C(a) + C(c) - C(b)$ since $\rho(1), \ldots, \rho(C(a))$ and $\sigma(1), \ldots, \sigma(C(c))$ are
all chosen from 1, 2, ..., C(b). Dropping all terms with i > l, $d_E(a,b) + d_E(b,c)$
is at least

$$\sum_{i=1}^{l} \big[ d(a_i, b_{\rho(i)}) + d(b_{\rho(i)}, c_{\omega(i)}) \big] + 2M(C(b) - C(c)) + M(C(c) - C(a)) \qquad (3)$$

Just as before, $d(a_i, c_{\omega(i)}) - \theta M \le 0$, so (3) is at least as big as

$$\sum_{i=1}^{C(a)} d(a_i, c_{\omega(i)}) + 2M(C(b) - C(c)) - \theta M(C(a) - l) + M(C(c) - C(a))$$

Now $C(b) - C(c) \ge C(a) - l \ge 0$ so we get

$$d_E(a,b) + d_E(b,c) \ge \sum_{i=1}^{C(a)} d(a_i, c_{\omega(i)}) + M(C(c) - C(a)) + M(2 - \theta)(C(a) - l) \ge d_E(a,c)$$

concluding the proof that $d_E$ is a metric.

Passing to $d_{Em}$ we now assume $\theta \le 1$, which implies $d_{Em}(a,c) \le M$. If
$C(b) \le C(c)$ it is certainly true that

$$d_{Em}(a,b) + d_{Em}(b,c) \ge d_{Em}(a,c)$$

so we will suppose $C(b) > C(c)$ and reuse the notation just employed for $d_E$.
Using (3) again, we can write

$$C(b)\big(d_{Em}(a,b) + d_{Em}(b,c)\big) \ge (C(c) - l)M + \sum_{i=1}^{l} d(a_i, c_{\omega(i)}) + (2C(b) - 2C(c) - C(a) + l)M$$

Since $\theta \le 1$, the sum of the first two terms on the right is at least $C(c)\,d_{Em}(a,c)$
and we also have $2C(b) - 2C(c) - C(a) + l \ge C(b) - C(c) > 0$, so

$$d_{Em}(a,b) + d_{Em}(b,c) \ge \frac{C(c)\,d_{Em}(a,c) + M(C(b) - C(c))}{C(b)} \ge d_{Em}(a,c)$$

as it is a convex combination of $d_{Em}(a,c)$ and M. □

2.1 Simple properties of d_E and d_Em

We start with some computational results about $d_E$. The first says that a and
c can be taken as disjoint.

Proposition 2. $d_E(a,c) = d_E(a_c, c_a)$

Proof. Assume $C(a) \le C(c)$. We have to show that among the permutations $\gamma$
in $G_c$ which minimise

$$d^\gamma(a,c) = \sum_{j=1}^{C(a)} d(a_j, c_{\gamma(j)}) + M|C(c) - C(a)|$$

there exists one in which maximally many identical elements (with multiplicity)
of a and c are matched up by $\gamma$. But if $a_j = c_{\gamma(k)}$ then

$$d(a_j, c_{\gamma(k)}) + d(a_k, c_{\gamma(j)}) \le d(a_j, c_{\gamma(j)}) + d(a_k, c_{\gamma(k)})$$

is certainly true, so if we start with any $\gamma$ that minimises $d^\gamma(a,c)$, we can find
another with the required property. □

Corollary 3. If d/M is the discrete metric then $d_E(a,c) = M \max(C(a_c), C(c_a))$,
and so $d_E$ generalises the bag distance.

The next result is needed to establish completeness. 

Lemma 4. If $x, y \in X$, $a, c \in E$ and $C(a) = C(c) = n$, then

$$|d_E(a + e_x, c + e_y) - d_E(a,c)| \le d(x,y)$$

Proof. $d_E(a + e_x, c + e_y) \le d(x,y) + d_E(a,c)$ because its right side is obtained
from its left side by permuting the subscripts in the sense of the definition of
$d_E$.

Now, renumbering so as to identify x as $a_1$ and y as $c_{n+1}$ (if these subscripts
were the same we would be finished) suppose that

$$d_E(a + e_x, c + e_y) = d(x, c_1) + d(a_{n+1}, y) + \sum_{j=2}^{n} d(a_j, c_j)$$

Then

$$d_E(a + e_x, c + e_y) + d(x,y) \ge d(a_{n+1}, c_1) + \sum_{j=2}^{n} d(a_j, c_j) \ge d_E(a,c)$$

as required. Simple examples show that the bound d(x, y) is tight. □

Finally we compare sequences in $d_E$ and $d_{Em}$.

Proposition 5. Let $S_i$ be a sequence in E. Then any of the following is true
with respect to $d_E$ iff it is true with respect to $d_{Em}$: (i) $S_i$ is Cauchy; (ii) $S_i$ is
convergent; (iii) $S_i$ has limit $l \in E$.

Proof. We first show that $d_E$ and $d_{Em}$ have the same Cauchy sequences. Since
multisets of cardinalities r and t are at least $M|t - r|$ apart in $d_E$, it follows
that any Cauchy sequence for $d_E$ must eventually have constant cardinality, in
which case $d_E$ and $d_{Em}$ are mutually proportional and so the sequence is also
Cauchy for $d_{Em}$.

Now suppose $S_i$ is Cauchy for $d_{Em}$, write $s_i = C(S_i)$ and then for each
$\varepsilon > 0$, $\exists N = N(\varepsilon)$ such that whenever $i, j \ge N$, $d_{Em}(S_i, S_j) < \varepsilon$. But then
$d_{Em}(S_i, S_j) \ge M(1 - s_j/s_i)$ supposing $s_i \ge s_j$ and as $\varepsilon/M > 1 - s_j/s_i$, no subsequence
of the $s_i$ can go to infinity, and hence the sequence $s_i$ is bounded (for each $s_j$, and
hence in general). But then $M(1 - s_j/s_i)$ only takes finitely many positive values
so for sufficiently small $\varepsilon$ this gives a contradiction unless the $s_i$ are eventually
constant. So $d_E$ and $d_{Em}$ are again proportional and $S_i$ is Cauchy with respect
to $d_E$. (There is also a trivial case in which $S_i = e_0$ infinitely often.)

An exactly similar argument shows that any limit of such a sequence (either
metric) again has the same cardinality. It follows that $d_E$ and $d_{Em}$ also have
the same convergent sequences (and limits). □

We are now ready for the main result. 

Proposition 6. (Topology and completeness.)

$d_E$ and $d_{Em}$ induce the same topology on E. The metrics $d_E$ and $d_{Em}$ are
complete iff d is.

Proof. By Proposition 5, $d_E$ and $d_{Em}$ have the same convergent sequences (and limits),
and so induce the same topology on E. We also see that given d, either both or
neither of $d_E$ and $d_{Em}$ are complete metrics.

If $x_i$ is a non-convergent Cauchy sequence in X, then $S_i = \{x_i\}$ is a
non-convergent Cauchy sequence for both $d_E$ and $d_{Em}$.

Supposing that d is complete, let $S_i$ be a sequence of multisets of X which
is Cauchy in $d_E$, with all $C(S_i) = n > 1$ (the completeness of d implies the case
n = 1). Given $\varepsilon > 0$, $\exists N = N(\varepsilon)$ such that $m \ge N \Rightarrow d_E(S_m, S_N) < \varepsilon$. As
every element^3 of each $S_m$ for $m \ge N$ is then within d-distance $\varepsilon$ of some element
of $S_N$ it follows that there exists a totally bounded region of X containing all
elements of all the $S_i$. Since X is complete, the completion of this region is
(can be regarded as) a compact subset of X and now we can assume that X is
compact.

Recalling that a Cauchy sequence converges iff it has a convergent subsequence,
we select an arbitrary $x_i$ from each $S_i$ (using the axiom of choice).
Since X is compact, the sequence $x_i$ has a convergent subsequence $y_i = x_{t(i)}$
with limit y (say). Writing $T_i$ for $S_{t(i)}$, we denote by $T_i'$ the multiset $T_i - e_{y_i}$.
Using Lemma 4 we have

$$|d_E(T_i, T_j) - d_E(T_i', T_j')| \le d(y_i, y_j)$$

and it follows that $T_i'$ is a Cauchy sequence of cardinality n - 1, and we can
assume that $T_i'$ has limit T'. Using Lemma 4 again, and denoting $T' + e_y$ by T,

$$|d_E(T_i, T) - d_E(T_i', T')| \le d(y_i, y)$$

and so $T_i$ converges to T, which is therefore the limit of the Cauchy sequence
$S_i$. □

^3 As always, this is with multiplicity. If some element of X occurs three times in $S_N$, then
at least three elements (with multiplicity) of each $S_m$ are within d-distance $\varepsilon$ of it.

2.2 An algorithm for d_E

We show that calculation of $d_E$ is an integer programming problem. As usual
suppose $C(a) \le C(c)$ and $a \cap c = e_0$. Just as in the Hausdorff heuristic, label
the rows (the columns) of a matrix by elements of R(a) (of R(c)), and put the
d-distances as entries. Add one more row whose entries are all M, to give a
matrix D.

Define a new matrix H, the same shape as D, constrained to satisfy

$$\sum_i h_{ij} = c(j) \text{ for } j \le \#R(c) \quad\text{and}\quad \sum_j h_{ij} = a(i) \text{ for } i \le \#R(a)$$

implying $\sum_j h_{1+\#R(a),\,j} = C(c) - C(a)$. Then $d_E(a,c)$ is the minimum value of
$\sum_{i,j} d_{ij} h_{ij}$ (the trace of $D^T H$), for which all the $h_{ij} \in N$.
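
The formulation above is a small transportation problem, and for tiny inputs it can be solved by exhaustive enumeration of the admissible matrices H. The sketch below does exactly that. It is not an efficient integer programming solver, and the names `compositions`, `d_E_transport` and `d_capped` are assumptions made for the illustration.

```python
from collections import Counter


def compositions(total, caps):
    """Yield tuples h with 0 <= h[j] <= caps[j] and sum(h) == total."""
    if not caps:
        if total == 0:
            yield ()
        return
    for first in range(min(total, caps[0]) + 1):
        for rest in compositions(total - first, caps[1:]):
            yield (first,) + rest


def d_E_transport(a, c, d, M):
    """Minimise sum d_ij * h_ij over non-negative integer H with the given margins."""
    if sum(a.values()) > sum(c.values()):
        a, c = c, a
    rows, cols = list(a), list(c)
    # Matrix D: d-distances between root sets, plus one extra row of M's.
    D = [[d(x, y) for y in cols] for x in rows] + [[M] * len(cols)]
    row_sums = [a[x] for x in rows] + [sum(c.values()) - sum(a.values())]
    col_sums = [c[y] for y in cols]

    def best(i, remaining):
        # remaining[j] is the part of column sum j not yet used by rows < i.
        if i == len(row_sums):
            return 0 if not any(remaining) else float("inf")
        result = float("inf")
        for h in compositions(row_sums[i], remaining):
            cost = sum(dij * hij for dij, hij in zip(D[i], h))
            left = [r - hij for r, hij in zip(remaining, h)]
            result = min(result, cost + best(i + 1, left))
        return result

    return best(0, col_sums)


def d_capped(x, y):
    # Hypothetical bounded metric on the real line with sup d = 2.
    return min(abs(x - y), 2)


print(d_E_transport(Counter({0: 2}), Counter({0: 1, 1: 2}), d_capped, M=1))
```

For a = 2e_0 and c = e_0 + 2e_1 the matrix D has rows (0, 1) and (1, 1), the row sums are 2 and 1, and the column sums are 1 and 2. The optimal H has rows (1, 1) and (0, 1), with objective value 2. A production version would hand the same constraints to a dedicated transportation or assignment solver.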

3 The multiset model F 

We will define a space A whose finite subsets include the multisets of X.

This time we identify the multiset $r e_x$ with $(x, r) \in X \times N$, as usual
interpreted as "r copies of x". Let A be the quotient space of X × N in which all
points of the form (x, 0) have been identified. $A_r$ will denote the (quotient of
the) subset X × {r}. We use N instead of $Z_+$ (which would be simpler) to get a
canonical bijection with model E. Note that A consists of the isolated point $e_0$
and isolated copies of X; furthermore A coincides with $\{e \in E : \#R(e) \le 1\}$.

Hence a multiset of X is a finite subset U of A whose underlying elements
of X are all distinct, viz. $r e_x, s e_x \in U \Rightarrow r = s$, and F will denote the space of
all such subsets of A. The following result should now be obvious.

Proposition 7. Let d' be any metric on A. Then the restriction of $d'_H$ to F is a
multiset metric on X, and it generalises the Hausdorff metric iff $d'(1e_x, 1e_y) = d(x, y)\ \forall x, y \in X$.

We will return later to the question of when this is complete.
For metrics on A, as before fix M > 0 and define $\theta = \sup d / M$ when d is
bounded. We start with the functions $d_A$ and $d_{Am}$ from A × A to R defined by

$$d_A(r e_x, t e_z) = M|t - r| + \min(r, t)\,d(x, z)$$

and

$$d_{Am}(r e_x, t e_z) = \frac{d_A(r e_x, t e_z)}{\max(r, t)}, \text{ or } 0 \text{ when } r = t = 0$$

Noting that (a) these are well-defined on A × A, (b) they are the respective
restrictions to A × A of $d_E$ and $d_{Em}$, and (c) they both agree with d when
r = t = 1, it follows that they are metrics on A when $\theta \le 2$ and when $\theta \le 1$
respectively. Actually there is a small surprise.
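
The two functions on A are simple enough to write down directly. In the sketch below a point $r e_x$ of A is represented as the pair `(x, r)`, which is an assumption of the illustration, as are the function names and the sample values x = 0, z = 1.5, M = 1.

```python
def d_A(p, q, d, M):
    """d_A(r e_x, t e_z) = M|t - r| + min(r, t) d(x, z)."""
    (x, r), (z, t) = p, q
    # All points (x, 0) are identified with e_0, so d is not consulted there.
    spread = d(x, z) if min(r, t) > 0 else 0
    return M * abs(t - r) + min(r, t) * spread


def d_Am(p, q, d, M):
    """d_A divided by max(r, t), or 0 when r = t = 0."""
    top = max(p[1], q[1])
    return 0 if top == 0 else d_A(p, q, d, M) / top


def d_line(x, y):
    # Illustrative underlying metric (an assumption): absolute difference.
    return abs(x - y)


x, z, M = 0, 1.5, 1
gap = (d_A((x, 2), (x, 1), d_line, M)
       + d_A((x, 1), (z, 2), d_line, M)
       - d_A((x, 2), (z, 2), d_line, M))
print(gap, 2 * M - d_line(x, z))
```

The three terms are 1, 2.5 and 3, so the triangle gap through $e_x$ is 0.5, which equals 2M - d(x, z). The gap becomes negative as soon as d(x, z) exceeds 2M.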

Proposition 8. If d is unbounded then $d_A$ and $d_{Am}$ are non-metrics for all M.
If d is bounded, then $d_A$ and $d_{Am}$ are both metrics iff $\theta \le 2$.

Proof. As $d_A(2e_x, e_x) + d_A(e_x, 2e_z) - d_A(2e_x, 2e_z) = 2M - d(x, z)$, if $d_A$ is a
metric, d must be bounded and $\theta \le 2$. Use the same example for $d_{Am}$.

It only remains to show that $d_{Am}$ is a metric when $\theta \le 2$. We fix $r e_x, t e_z \in A$,
assuming $r \le t$. Now if $s \le t$, it is immediate that

$$d_{Am}(r e_x, s e_y) + d_{Am}(s e_y, t e_z) \ge d_{Am}(r e_x, t e_z)$$

so we will take s > t. Using the definition of $d_{Am}$,

$$st\big(d_{Am}(r e_x, s e_y) + d_{Am}(s e_y, t e_z) - d_{Am}(r e_x, t e_z)\big) \qquad (4)$$

$$= M(t + r)(s - t) + rt\,d(x, y) + t^2 d(y, z) - rs\,d(x, z)$$

and using $t^2 \ge rt$ we get that (4) is at least as large as

$$Mt(s - t) + r(s - t)(M - d(x, z))$$

which is non-negative provided $2M \ge \frac{2r}{r+t} d(x, z)$, whose right side cannot
exceed $\sup d = \theta M$. So $d_{Am}$ is a metric when $\theta \le 2$. □

Remark 9. If $r \le t$, $M(t - r) \le t\,d_{Am}(r e_x, t e_z) = d_A(r e_x, t e_z) \le tM\max(1, \theta)$.
Actually, $d_{Am}(r e_x, t e_z)$ is a convex combination of d(x, z) and M and therefore
lies between them.

Proposition 10. Let $r_i e_{x(i)}$ be a sequence in A. Then any of the following is
true with respect to $d_A$ iff it is true with respect to $d_{Am}$: (i) $r_i e_{x(i)}$ is Cauchy;
(ii) $r_i e_{x(i)}$ is convergent; (iii) $r_i e_{x(i)}$ has limit $l \in A$.

The proof is exactly as in Proposition 5.

Proposition 11. (Main properties of A)

1. $d_{Am}$ and $d_A$ both induce the same topology on A, coinciding with the
quotient topology inherited from X × N.

2. $d_A$ and $d_{Am}$ are complete metrics iff d is.

3. The subset U of A is compact iff each $U_r = U \cap A_r$ is a compact subset
of $A_r$, and almost all the $U_r$ are empty.

Proof. (Clause 1) We have just seen that $d_A$ and $d_{Am}$ have the same convergent
sequences and limits, so they induce the same topology.

Let $r e_x, t e_z \in A$ with t > 0 and choose $\varepsilon > 0$. Now $M|r - t| \le d_A(r e_x, t e_z) < \varepsilon$
implies r = t when $\varepsilon$ is sufficiently small, and indeed in this case

$$d_A(t e_x, t e_z) < \varepsilon \iff d(x, z) < \varepsilon / t$$

It follows that any sufficiently small open ball around $t e_z$ in the $d_A$-topology
is also an open ball in the quotient topology, and vice versa.

The point $e_0$ is isolated in both topologies. So these three topologies on A
coincide.

(Clause 2) By the preceding proposition $d_A$ is complete iff $d_{Am}$ is. As any
Cauchy sequence eventually lies in a single $A_r = X \times \{r\}$, it converges iff this
is true for the same sequence regarded as a sequence in X, and any limits also
coincide.

(Clause 3) Suppose U is compact. If infinitely many $U_r$ were non-empty we
could find a sequence in U with no convergent subsequence (compactness being
equivalent to sequential compactness in metric spaces). If $r e_{x(i)}$ is a sequence
in some $U_r$ then it has a convergent subsequence in U but this must converge
to a point of $U_r$. Conversely, if $U_r$ is compact in $A_r$ then it is compact in X
and then U is a finite union of compact sets, and so compact. □
Let $d_F$ and $d_{Fm}$ be the Hausdorff metrics arising from $d_A$ and $d_{Am}$ respectively.
Let us write F' for the set of all finite subsets of A.

Proposition 12. $d_F$ and $d_{Fm}$ are metrics on F' iff $\theta \le 2$, and both coincide
with the Hausdorff metric for the case of ordinary subsets. They are complete
metrics on F' iff d is.

Proof. $\theta \le 2$ is necessary for the triangle inequality for $d_A$ (and so for $d_F$) or
for $d_{Am}$ (and so for $d_{Fm}$) to hold. The rest of the statement is an immediate
consequence of their definitions and the stated properties of the Hausdorff
metric. □
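
As an illustration of $d_F$, the sketch below takes the Hausdorff distance with respect to $d_A$ between two finite non-empty subsets of A. Points $r e_x$ are represented as pairs `(x, r)`; that encoding, the function names and the sample values are assumptions made for the example.

```python
def d_A(p, q, d, M):
    """d_A(r e_x, t e_z) = M|t - r| + min(r, t) d(x, z), with points as (x, r) pairs."""
    (x, r), (z, t) = p, q
    spread = d(x, z) if min(r, t) > 0 else 0
    return M * abs(t - r) + min(r, t) * spread


def d_F(U, V, d, M):
    """Hausdorff distance between finite non-empty subsets U, V of A, built on d_A."""
    U, V = list(U), list(V)
    if not U or not V:
        raise ValueError("both subsets must be non-empty")
    matrix = [[d_A(p, q, d, M) for q in V] for p in U]
    row_minima = [min(row) for row in matrix]
    col_minima = [min(col) for col in zip(*matrix)]
    return max(row_minima + col_minima)


def d_capped(x, y):
    # Hypothetical bounded metric on the real line with sup d = 2 (theta = 2 for M = 1).
    return min(abs(x - y), 2)


x, y = 0, 0.5
split = d_F({(x, 1), (x, 2)}, {(y, 1), (y, 2)}, d_capped, M=1)
whole = d_F({(x, 3)}, {(y, 3)}, d_capped, M=1)
print(split, whole)
```

With d(x, y) = 0.5 the first value is 1.0 = 2d(x, y) and the second is 1.5 = 3d(x, y), so splitting three copies of a point into one copy plus two copies reduces the distance.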



4 The multiset model G

We continue to suppose $\theta \le 2$. Of course the restrictions of $d_F$ and $d_{Fm}$ to
F (multisets on X) need not be complete. For instance, if $d(x_i, x)$ is strictly
decreasing to zero and $y_i = x_{i+1}$, then the sequence $2e_{x_i} + 3e_{y_i}$ is Cauchy in $d_F$
or $d_{Fm}$ but its limit is $\{2e_x, 3e_x\} \in F' \setminus F$.

We deal with this discrepancy in the following way. Observe that to every
$U \in F'$ there is a function $t_U : X \to N$ defined by

$$t_U(x) = \sum_{r e_x \in U} r$$

and indeed if $U \in F$, $t_U$ is its representative in model E. Define an equivalence
relation $\sim$ on F' by decreeing $U \sim V$ iff $t_U = t_V$. For example, if $x \neq y$ one $\sim$-class is

$$\{e_x, 2e_x, 3e_x, 2e_y\},\ \{e_x, 5e_x, 2e_y\},\ \{2e_x, 4e_x, 2e_y\},\ \{6e_x, 2e_y\}.$$

Obviously every class is finite, contains exactly one element of F, and is a singleton iff
$t_U(x) \le 2\ \forall x$.

We now write G for $F'/\sim$ and $d_G$ for the quotient pseudometric on G
corresponding to $d_F$. There are canonical bijections among G, F and E. We
extend the notations $e_0$, R() and C() to F' and G in the obvious way. If
$e \in G \setminus \{e_0\}$, it follows that $d_G(e, e_0) \ge M$ since $d_A(r e_x, e_0) \ge M$ for all $r e_x \in A$, $r \neq 0$.

Now $d_G$ is definitely less than $d_F$ in general as

$$d_G(3e_x, 3e_y) \le d_F(\{e_x, 2e_x\}, \{e_y, 2e_y\}) \le 2d(x, y) < 3d(x, y) = d_F(3e_x, 3e_y)$$

The most important facts about $d_G$ are corollaries of the following result.

Proposition 13. If $e, f \in G \setminus \{e_0\}$, then $d_G(e, f) \ge d_H(R(e), R(f))$.

Proof. Suppose $x \in R(e)$, $y \in R(f)$ are such that $d(x, y) = d_H(R(e), R(f))$. We
can assume $x \notin R(f)$. Let $e = p_0, p_1, \ldots, p_n, p_{n+1} = f$ be a sequence of elements
of G, referring to the notation of Section 1.1. If any $p_i$ is $e_0$ then we have two or more
terms $\ge M$ so the path length is at least $2M \ge \sup d \ge d(x, y)$ and we now
assume that all $R(j) := R(p_j)$ are non-empty.

We will employ the observation that $d_F(u, v) \ge \min_{b \in R(v)} d(a, b)$ for $a \in R(u)$ with $a \notin R(v)$.
For any sequence $x_0, x_1, \ldots$, all in $\cup_j R(j)$, define $s_i$ by $x_i \in R(s_i)$ where $s_i$ is
maximal. Take $x_0 = x$ and choose $x_1 \in R(1 + s_0)$ such that $d(x_0, x_1)$ is minimal.
So the $d_F$-distance between any member of $p_{s_0}$ and any member of $p_{1+s_0}$ is at
least $d(x_0, x_1)$.

If $x_1 \in R(f)$ we are finished as our path is at least $d(x_0, x_1) \ge d(x, y)$.
Otherwise choose $x_2 \in R(1 + s_1)$ such that $d(x_1, x_2)$ is minimal.

Again we are finished if $x_2 \in R(f)$ as our path is (at least) $d(x_0, x_1) +
d(x_1, x_2)$. If not, choose $x_3 \in R(1 + s_2)$ to minimise $d(x_2, x_3)$. As the $s_i$ are
increasing we get a sequence of terms from x to some $z \in R(f)$ whose sum is at
least d(x, y). □

Corollary 14. (1) $d_G$ agrees with the Hausdorff metric on finite subsets of X.

(2) If d is uniformly discrete then $d_G$ is a complete metric on G.

(3) $d_F(e, f) \ge d_H(R(e), R(f))$.

Proof. (1) For finite subsets e, f of X,

$$d_F(e, f) \ge d_G(e, f) \ge d_H(R(e), R(f)) = d_H(e, f) = d_F(e, f)$$

(2) $d_H$ has the same lower bound as d. By clause (1), so does $d_G$, making it
a metric. $d_G$ is complete because it is uniformly discrete.

(3) $d_F$ is at least as big as $d_G$. □

In the notation of the proposition, if we have $t_e(x) > t_f(x)$ and we define
$s_0$ to be the maximal s such that $t_s(x) > t_{1+s}(x)$, we cannot use the same
argument to show that $d_G$ is a metric in general, because we might have z = x.
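
The equivalence classes of $\sim$ used to build G can be enumerated mechanically: a class consists of all ways of splitting each multiplicity into distinct positive parts. The sketch below lists the class of $6e_x + 2e_y$ under the assumption that points $r e_x$ of A are stored as pairs `(x, r)`; the function names are hypothetical.

```python
from collections import Counter


def t(U):
    """Multiplicity function t_U: add up r over all pairs (x, r) in U."""
    total = Counter()
    for x, r in U:
        total[x] += r
    return +total  # unary plus drops zero entries such as (x, 0)


def distinct_partitions(n, largest=None):
    """Yield the partitions of n into distinct positive parts, as tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for part in range(min(n, largest), 0, -1):
        for rest in distinct_partitions(n - part, part - 1):
            yield (part,) + rest


def equivalence_class(multiset):
    """Return every finite subset U of A whose t_U equals the given multiset."""
    members = [frozenset()]
    for x, n in multiset.items():
        members = [U | {(x, r) for r in parts}
                   for U in members
                   for parts in distinct_partitions(n)]
    return members


for U in equivalence_class(Counter({"x": 6, "y": 2})):
    print(sorted(U))
```

Since 6 has four partitions into distinct parts (6, 5+1, 4+2, 3+2+1) and 2 has only one, the class has four members. A multiset with all multiplicities at most 2 gives a singleton class.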

5 Concluding remarks 

Aside from the potential applications mentioned at the start or described in
[SIYS07], these metrics might also be useful in voting theory. An election is a
multiset on the set X of permitted ballot types. For instance, if X is the total
orderings (permutations) of n candidates, one well-known metric on X is the
Kendall $\tau$-distance [DD09, p. 211], defined as the fewest adjacent transpositions
required to change one into the other.
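
As a small illustration of this setting, the sketch below computes the Kendall distance by counting the pairs of candidates that two rankings order differently, which equals the minimum number of adjacent transpositions, and stores an election as a multiset of ballots. The candidate labels, the ballot counts and the function name `kendall_tau` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def kendall_tau(p, q):
    """Number of candidate pairs that the rankings p and q order differently."""
    position = {candidate: i for i, candidate in enumerate(q)}
    return sum(1 for a, b in combinations(p, 2) if position[a] > position[b])


# A hypothetical election: a multiset on the set X of rankings of three candidates.
election = Counter({
    ("A", "B", "C"): 3,
    ("B", "A", "C"): 2,
    ("C", "B", "A"): 1,
})

print(kendall_tau(("A", "B", "C"), ("B", "A", "C")))
print(kendall_tau(("A", "B", "C"), ("C", "B", "A")))
print(sum(election.values()))
```

Because X is finite here, d is bounded, with $\sup d = n(n-1)/2$ attained by a ranking and its reverse. For n = 3 this is 3, so the condition $\theta \le 2$ of Proposition 1 amounts to choosing $M \ge 3/2$.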

Future work ought to look at possible applications and clarify the relation- 
ships among E, F and G. 